Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes
Unsupervised deep learning for optical flow computation has achieved
promising results. Most existing deep-net-based methods rely on image
brightness consistency and local smoothness constraints to train the networks.
Their performance degrades in regions where repetitive textures or occlusions
occur. In this paper, we propose Deep Epipolar Flow, an unsupervised optical
flow method which incorporates global geometric constraints into network
learning. In particular, we investigate multiple ways of enforcing the epipolar
constraint in flow estimation. To alleviate a "chicken-and-egg" type of problem
encountered in dynamic scenes where multiple motions may be present, we propose
a low-rank constraint as well as a union-of-subspaces constraint for training.
Experimental results on various benchmarking datasets show that our method
achieves competitive performance compared with supervised methods and
outperforms state-of-the-art unsupervised deep-learning methods.
Comment: CVPR 201
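One way to realize the epipolar constraint described above is to penalize the Sampson distance between flow-induced correspondences and their epipolar lines. The sketch below is a minimal NumPy illustration, not the paper's implementation; the fundamental matrix `F` is assumed to be estimated elsewhere (e.g. from the flow itself or from feature matches):

```python
import numpy as np

def sampson_error(F, p1, p2):
    """Sampson approximation of the epipolar error x2^T F x1 = 0.

    F  : (3, 3) fundamental matrix (assumed given)
    p1 : (N, 2) pixel coordinates in frame 1
    p2 : (N, 2) matched coordinates in frame 2 (p1 + predicted flow)
    """
    x1 = np.hstack([p1, np.ones((len(p1), 1))])  # homogeneous coordinates
    x2 = np.hstack([p2, np.ones((len(p2), 1))])
    Fx1 = x1 @ F.T    # (N, 3): epipolar lines of p1 in image 2
    Ftx2 = x2 @ F     # (N, 3): epipolar lines of p2 in image 1
    num = np.sum(x2 * Fx1, axis=1) ** 2
    den = Fx1[:, 0]**2 + Fx1[:, 1]**2 + Ftx2[:, 0]**2 + Ftx2[:, 1]**2
    return num / den

def epipolar_loss(F, p1, flow):
    """Mean Sampson error of the flow field; zero iff flow obeys the geometry."""
    return float(np.mean(sampson_error(F, p1, p1 + flow)))
```

In an unsupervised setting, a term like this would be weighted against the photometric and smoothness losses; flow consistent with the camera geometry drives it to zero, while flow that strays off the epipolar lines is penalized.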
GP-SLAM+: real-time 3D lidar SLAM based on improved regionalized Gaussian process map reconstruction
This paper presents a 3D lidar SLAM system based on improved regionalized
Gaussian process (GP) map reconstruction to provide both low-drift state
estimation and mapping in real-time for robotics applications. We utilize
spatial GP regression to model the environment. This tool enables us to
recover surfaces, including those in sparsely scanned areas, and to obtain
uniform samples with uncertainty. These properties facilitate robust data
association and map
updating in our scan-to-map registration scheme, especially when working with
sparse range data. Compared with previous GP-SLAM, this work overcomes the
prohibitive computational complexity of GP and redesigns the registration
strategy to meet the accuracy requirements in 3D scenarios. For large-scale
tasks, a two-thread framework is employed to suppress the drift further. Aerial
and ground-based experiments demonstrate that our method allows robust odometry
and precise mapping in real-time. It also outperforms the state-of-the-art
lidar SLAM systems in our tests with lightweight sensors.
Comment: Accepted by IROS 202
Non-intrusive load identification method based on GAF and RAN networks
Non-intrusive load identification can improve the interaction efficiency between the power supply side and the user side of the grid. Applying this technology can alleviate the problem of energy shortage and is a key technique for achieving efficient management on the user side. In response to the cumbersome process of manually selecting load features and the low identification accuracy of traditional machine learning algorithms for non-intrusive load identification, this paper proposes a method that transforms the one-dimensional reactive electrical signal of the load into a two-dimensional image using Gramian Angular Field (GAF) coding and uses a Residual Attention Network (RAN) for load classification and recognition. Transforming the one-dimensional electrical signal into a two-dimensional image as the input to the RAN network retains the original load information while providing richer information from which the RAN network can extract load features. Furthermore, the RAN network mitigates the performance degradation and vanishing-gradient issues of deep networks through bottleneck residual blocks. Finally, experiments on a public dataset verify the effectiveness of the proposed method.
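The GAF coding step above maps a 1-D load signal to a 2-D image that a CNN such as RAN can consume. A minimal sketch of the standard Gramian Angular Field transform follows; the paper's exact preprocessing (windowing, sampling rate) may differ:

```python
import numpy as np

def gramian_angular_field(signal, summation=True):
    """Encode a 1-D signal as a 2-D Gramian Angular Field image.

    The signal is rescaled to [-1, 1], mapped to angles phi = arccos(x),
    and pixel (i, j) is cos(phi_i + phi_j) (summation field, GASF)
    or sin(phi_i - phi_j) (difference field, GADF).
    """
    x = np.asarray(signal, dtype=float)
    x = 2 * (x - x.min()) / (x.max() - x.min()) - 1   # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))            # polar encoding
    if summation:
        return np.cos(phi[:, None] + phi[None, :])
    return np.sin(phi[:, None] - phi[None, :])
```

The summation field is symmetric and carries the rescaled signal on its diagonal via cos(2·phi_i) = 2x_i² − 1, which is one reason the image representation retains the original load information.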
Taking a Respite from Representation Learning for Molecular Property Prediction
Artificial intelligence (AI) has been widely applied in drug discovery with a
major task as molecular property prediction. Despite the boom of AI techniques
in molecular representation learning, some key aspects underlying molecular
property prediction haven't been carefully examined yet. In this study, we
conducted a systematic comparison on three representative models, random
forest, MolBERT and GROVER, which utilize three major molecular
representations, extended-connectivity fingerprints, SMILES strings and
molecular graphs, respectively. Notably, MolBERT and GROVER are pretrained on
large-scale unlabelled molecule corpora in a self-supervised manner. In
addition to the commonly used MoleculeNet benchmark datasets, we also assembled
a suite of opioids-related datasets for downstream prediction evaluation. We
first conducted dataset profiling on label distribution and structural
analyses; we also examined the activity cliffs issue in the opioids-related
datasets. Then, we trained 4,320 predictive models and evaluated the usefulness
of the learned representations. Furthermore, we examined model evaluation by
studying the effect of statistical tests, evaluation metrics, and
task settings. Finally, we dissected the chemical space generalization into
inter-scaffold and intra-scaffold generalization and measured prediction
performance to evaluate model generalizability under both settings. By taking
this respite, we reflected on the key aspects underlying molecular property
prediction, the awareness of which can, hopefully, bring better AI techniques
to this field.
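The inter- vs intra-scaffold distinction above can be sketched as a splitting rule over scaffold groups. The helper below is a simplified illustration with hypothetical inputs; scaffold keys are assumed precomputed (e.g. Bemis-Murcko scaffolds via RDKit), and this is not the paper's actual protocol:

```python
from collections import defaultdict

def scaffold_splits(mol_ids, scaffolds, test_fraction=0.2):
    """Sketch of inter- vs intra-scaffold test sets.

    mol_ids   : list of molecule identifiers
    scaffolds : parallel list of scaffold keys (assumed precomputed)

    Inter-scaffold test molecules come from scaffolds held out of training
    entirely; intra-scaffold test molecules are drawn from scaffolds that
    also appear in training.
    """
    groups = defaultdict(list)
    for mid, scaf in zip(mol_ids, scaffolds):
        groups[scaf].append(mid)
    ordered = sorted(groups.values(), key=len)       # hold out small scaffolds
    n_test = max(1, int(test_fraction * len(mol_ids)))
    inter_test, count = [], 0
    for members in ordered:                          # whole scaffolds at a time
        if count >= n_test:
            break
        inter_test.extend(members)
        count += len(members)
    # one held-out molecule from each scaffold that keeps members in training
    intra_test = [members[0] for members in groups.values() if len(members) > 1]
    return inter_test, intra_test
```

Inter-scaffold generalization is usually the harder setting, since the model must extrapolate to chemotypes it has never seen; comparing performance across the two splits is what separates memorizing scaffolds from learning transferable structure-property relationships.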
Gold-YOLO: Efficient Object Detector via Gather-and-Distribute Mechanism
In the past years, YOLO-series models have emerged as the leading approaches
in the area of real-time object detection. Many studies pushed up the baseline
to a higher level by modifying the architecture, augmenting data and designing
new losses. However, we find that previous models still suffer from an
information-fusion problem, although Feature Pyramid Network (FPN) and Path
Aggregation Network (PANet) have alleviated this. Therefore, this study
provides an advanced Gather-and-Distribute (GD) mechanism, which is realized
with convolution and self-attention operations. This newly designed model,
named Gold-YOLO, boosts multi-scale feature fusion capabilities and achieves
an ideal balance between latency and accuracy across all model scales.
Additionally, we implement MAE-style pretraining in the YOLO-series for the
first time, allowing YOLO-series models to benefit from unsupervised
pretraining. Gold-YOLO-N attains an outstanding 39.9% AP on the COCO val2017
dataset and 1030 FPS on a T4 GPU, which outperforms the previous SOTA model
YOLOv6-3.0-N with similar FPS by +2.4%. The PyTorch code is available at
https://github.com/huawei-noah/Efficient-Computing/tree/master/Detection/Gold-YOLO,
and the MindSpore code is available at
https://gitee.com/mindspore/models/tree/master/research/cv/Gold_YOLO.
Comment: Accepted by NeurIPS 202
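The gather-and-distribute idea, collecting features from every pyramid level into one global representation and injecting it back at each scale, can be caricatured without any attention machinery. The sketch below is a toy NumPy stand-in for illustration only; the real GD mechanism uses convolution and self-attention blocks rather than plain resizing and averaging:

```python
import numpy as np

def resize_nn(feat, size):
    """Nearest-neighbour resize of a square (C, H, W) feature map to (C, size, size)."""
    c, h, w = feat.shape
    ri = np.arange(size) * h // size
    ci = np.arange(size) * w // size
    return feat[:, ri][:, :, ci]

def gather_and_distribute(pyramid, fuse_size=16):
    """Toy sketch of one gather-and-distribute fusion step (not the paper's code).

    1. Gather: resize every pyramid level to a shared resolution and average
       them into one global descriptor (the paper fuses with convolution and
       self-attention here).
    2. Distribute: resize the global descriptor back to each level and add it
       as a residual, injecting global context at every scale.
    """
    gathered = np.mean([resize_nn(f, fuse_size) for f in pyramid], axis=0)
    return [f + resize_nn(gathered, f.shape[-1]) for f in pyramid]
```

The point of the design, as opposed to FPN/PANet's level-by-level neighbour fusion, is that every scale exchanges information with every other scale through the shared global representation in a single hop, rather than through a chain of adjacent merges.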